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Correlated variability in the spiking responses of pairs of neurons, also known as spike 
count correlation, is a key indicator of functional connectivity and a critical factor in 
population coding. Underscoring the importance of correlation as a measure for cognitive 
neuroscience research is the observation that spike count correlations are not fixed, 
but are rather modulated by perceptual and cognitive context. Yet while this context 
fluctuates from moment to moment, correlation must be calculated over multiple trials. 
This property undermines its utility as a dependent measure for investigations of cognitive 
processes which fluctuate on a trial-to-trial basis, such as selective attention. A measure 
of functional connectivity that can be assayed on a moment-to-moment basis is needed to 
investigate the single-trial dynamics of populations of spiking neurons. Here, we introduce 
the measure of population variance in normalized firing rate for this goal. We show using 
mathematical analysis, computer simulations and in vivo data how population variance in 
normalized firing rate is inversely related to the latent correlation in the population, and 
how this measure can be used to reliably classify trials from different typical correlation 
conditions, even when firing rate is held constant. We discuss the potential advantages 
for using population variance in normalized firing rate as a dependent measure for both 
basic and applied neuroscience research. 



Keywords: Spike count correlation, electrophysiology, population coding, variance, noise 



1. INTRODUCTION 

Cortical neurons emit variable spiking responses to repeated pre- 
sentations of an identical stimulus. This variability has often been 
taken to represent the stochastic output of Poisson-like spiking 
generation, and therefore been subjected to averaging to esti- 
mate the "true" neuronal response to the stimulus. It is also 
clear, however, that measurement of the variability and its rela- 
tionship to the mean response affords important insight into the 
mechanisms of neuronal coding. For instance, neuronal variabil- 
ity is reduced after stimulus onset in a bevy of cortical regions 
(Churchland et al, 2010), can reveal the neural mechanisms of 
decision-making (Churchland et al, 2011), and has been used 
to relate the activity of single neurons to behavior (for review 
see Nienborg et al, 2012). Our ability to understand variabil- 
ity, however, is constrained by single-neuron recordings. Noise 
in a neuron's responses to an identical stimulus might represent 
fundamental processes such as the diffusion of vesicles across a 
synapse (for review see Ribrault et al., 201 1), but it also may reflect 
the influences of other, unrecorded neurons on the spiking of the 
neuron under study. A deeper understanding of the mechanisms 
of neural coding requires measurement of the relative contribu- 
tion of these two factors, and in turn of how the interactions 
among neurons shape neural computations. 

With the advent and recent popularity of recording devices 
capable of measuring the spiking responses of many single 



neurons simultaneously, it has become possible to measure the 
statistical interactions among neurons. The "noise" that occurs in 
response to repeated presentations of an identical stimulus can be 
also manifest as correlated trial-to-trial variability among groups 
of neurons. This indicates that the source of this variance is not 
entirely synaptic or channel noise, but can be at least partially 
attributed to common influences across the observed neurons. 
Spike count correlation (r sc , or "noise" correlation) is there- 
fore indicative of a manner of functional connectivity between 
neurons. The structure of such correlated variability can reveal 
important features in the underlying neuronal circuits (Zohary 
et al, 1994; Shadlen and Newsome, 1998; Kohn and Smith, 2005; 
Smith and Kohn, 2008; Cohen and Maunsell, 2009; Mitchell et al, 
2009; Ko et al., 2011; Smith and Sommer, 2013). 

In addition to investigations of neuronal circuits, the magni- 
tude and structure of correlated variability has profound con- 
sequences on the information processing capacity of neuronal 
populations. Consider, for example, a population of neurons 
all encoding the same stimulus with identical tuning curves. 
Although each neuron's response is somewhat unreliable, this 
could potentially be mitigated by pooling across the population 
to average out the noise. If the noise is correlated across neu- 
rons, however, then it will not average out completely and the 
encoding of the stimulus will be affected. In this simple exam- 
ple, correlated variability is potentially detrimental for stimulus 
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encoding, although the full relationship depends on the struc- 
ture of correlation with respect to neuronal tuning curves (Abbott 
and Dayan, 1999; Averbeck et al, 2006), and is different in a 
model built from populations of neurons with heterogeneous 
vs. homogenous tuning curves (Shamir and Sompolinsky, 2006; 
Ecker et al., 2011). The magnitude of spike count correlation is 
not fixed, however, and the brain appears to modulate corre- 
lation in a manner that is contextually favorable. For instance, 
correlations are altered in populations of neurons after the onset 
of a visual stimulus (Kohn and Smith, 2005; Smith and Kohn, 
2008; Smith and Sommer, 2013), due to spatial attention (Cohen 
and Maunsell, 2009, 2011; Mitchell et al, 2009), perceptual 
learning (Gu et al., 2011), and association learning (Komiyama 
et al., 2010; Jeanne et al, 2013). Investigating the mecha- 
nisms responsible for neural correlations can therefore provide 
critical information about how the brain marshals processing 
resources in preparation for various perceptual and cognitive 
demands. 

The use of correlated variability to provide a window into 
the state of a network does have a substantial limitation: it is 
only defined over multiple repeated trials. This precludes the use 
of correlation to investigate cognitive and perceptual processes 
on a single-trial, or moment-to-moment, basis. One measure 
that can be assayed from a neuronal population on a single-trial 
basis, however, is the population variance in firing rate measured 
across all the simultaneously recorded neurons. We proved that 
this latter measure is inversely related to the magnitude of cor- 
relation latent in the neuronal population (see Appendix), and 
reasoned that it can be used as a single-trial measure of func- 
tional connectivity. This inverse relationship between single-trial 
population variance and cross-trial pairwise correlation is rooted 
in the well-known Law of Total Variance (Weiss et al, 2005). We 
will demonstrate the applicability of this measure for identifying 
different correlations in realistic conditions using computer sim- 
ulations and in vivo data. Our results can be taken as proof of 
concept that this method works in a simplified situation where 
firing rates are held constant and only correlation varies, although 
our mathematical derivation demonstrates that this relationship 
can be generalized to different firing rates. Thus, taking popula- 
tion variance into account will necessarily improve on using firing 
rate alone to estimate, on a single trial, cognitive states where 
correlation is a known variant. 

2. COMPUTER SIMULATIONS 
2.1. MATERIALS AND METHODS 

Generation of simulated datasets 

We investigated the ability by which population variance in nor- 
malized firing rate could discriminate between single trials drawn 
from low- and high-correlation regimes in simulated datasets. 
Specifically, we tested the sensitivity of classification to changes 
in population size and correlation difference between conditions. 
Simulated data were generated as thousands of trials of popula- 
tions of correlated Poisson spike counts using the Dichotomized 
Gaussian (DG) algorithm created by Macke and colleagues (2009; 
http://www.bethgelab.org/software/mvd) and recently used by a 
number of studies to simulate population responses of corre- 
lated neurons (Savin et al., 2010; Xue et al., 2010; Long and 



Carmena, 2011; Schaub and Schultz, 2012). ADG algorithm uses 
a multivariate normal distribution that is thresholded onto 0 or 
1 to construct a multivariate Poisson distribution, which pro- 
duces a known and invertible mapping of the input statistics. 
This algorithm finds the inverse mapping for specified output 
statistics and samples accordingly from the corresponding normal 
distribution. 

Populations of 120 neurons were simulated over 1000 tri- 
als for eight spike count correlation magnitudes evenly spaced 
between r sc = 0.05 and 0.40. As stated above, inputs to the algo- 
rithm included a mean firing rate vector and a full covariance 
matrix. The firing rate distributions of all simulated populations 
were matched with respect to the means and standard deviations, 
which were set equal to the overall statistics of the in vivo data 
(see following section). As a result, simulated datasets differed 
from each other only with respect to their latent correlations. 
For each simulated dataset, all of the pairs of neurons had the 
same specified correlation coefficient; given the scale of our sim- 
ulations {N = 120), this was necessary to ensure the algorithm 
reliably converged on a solution. Note that since correlation is 
not defined on a single-trial level, we use the term "latent cor- 
relation" throughout this report to indicate the correlation of the 
full dataset from which the single trial was drawn. 

Normalization and classification procedures 

In classifying single trials, we sought to measure the ability of an 
ideal observer to discriminate between trials drawn from either 
correlation condition based only on the population variance 
statistics. We employed classical receiver operating characteris- 
tic (ROC) analysis to classify single trials drawn from pairs of 
simulated datasets, each corresponding to a particular correla- 
tion condition. First, and in line with our mathematical proof 
(see Appendix), we normalized the firing rates for each neuron 
by z-scoring across responses for all trials of all conditions for 
that neuron (Figure 1A, top). That is, all trials of both the low- 
and high-correlation conditions were used in the same normal- 
ization in an effort to capture the full range of each neuron's 
activity. After normalization of firing rate, we calculated the pop- 
ulation variance (across neurons) of the z-scored responses for 
each single trial (Figure 1A, right). Trials with a low latent cor- 
relation tend to produce high population variance (blue), and 
trials with a high latent correlation tend to produce low popula- 
tion variance (red). Classification efficacy of population variance 
was determined using a sliding linear classifier with which we 
constructed ROC curves (Figure IB). In accordance with our 
rationale, we classified trials with population variance above the 
classifier threshold (high variance trials) to be from the low cor- 
relation condition, and trials with variance below the classifier 
threshold (low variance) to be from the high correlation condi- 
tion. Each point on the ROC curve represents the true positive 
rate of this rule against the false positive rate for a particular 
classifier threshold value. We also performed the identical classifi- 
cation procedure using mean normalized firing rate as a criterion 
to ensure no confounding firing rate effects. For both variance 
and firing rate analyses, classification ability was further quanti- 
fied from ROC curves by integration to yield area-under-curve 
(AUC) measurements. 
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FIGURE 1 | Schematic of classification procedure using example 
simulated data. Throughout the figure, red color represents data from 
the high-correlation condition and blue color represents data from the 
low-correlation condition. (A) Top: z-scored firing-rate distribution for 
one example simulated neuron. Trials from both correlation conditions 
are included. The solid line represents a Gaussian fit. Lower left: 
Matrix of data from all neurons (columns) and trials (rows). The 
lightness of the monochrome shading represents the normalized firing 
rate for the given neuron on the given trial. Right: example 
distributions of normalized firing-rates on single trials (top right: low 



correlation trial; bottom right: high-correlation trial). The variance of 
each distribution (s 2 ) is the single-trial population variance. (B) 
Distributions of population variances for high and low correlation 
conditions for the simulated data illustrated in (A). The arrows indicate 
the specific variances of the two single-trial distributions of normalized 
firing rates in (A). Note that overall, the population variance for the 
high correlation condition (red) is lesser than the population variance 
for the low correlation condition (blue). We classified single trials by 
using a sliding threshold (e.g., the dashed line). Values above this 
threshold were classified as low correlation. 



Classification of simulated data sets 

We constructed pairs of datasets that varied on two parameters of 
interest: population size and difference in average pairwise corre- 
lation between conditions (Ar sc ). To vary population size while 
holding correlation difference constant, we started with a single 
pair of datasets with a correlation difference fixed at a value of 0.05 
(r sc = 0.05 vs. r sc = 0.10), approximately equal to that which we 
observed in vivo. We tested this same value for Ar sc for a range 
of baseline correlation values and did not observe a systematic 
difference in the results. We then randomly subsampled this pair 
of datasets to build populations ranging in size from 10 to 120 
neurons. To vary Ar sc while holding population size constant, we 
fixed one dataset at r sc = 0.05 and paired it with datasets rang- 
ing from r sc = 0.05-0.4, such that the pairs produced differences 
in correlation ranging from Ar sc = 0-0.35. We subsampled neu- 
rons from these datasets to match population sizes to the average 
population sizes reflected in our in vivo data, approximately 80 
neurons. We classified all single trials from these pairs of datasets 
by both variance in normalized firing rate and mean normalized 
firing rate using the ROC methods given above. To reduce sam- 
pling error, AUC measures and ROC curves were constructed by 
averaging over 40 random subsamples. 

2.2. RESULTS 

Our simulated data demonstrated that classification ability, mea- 
sured by the area under the ROC curves (AUC), increases with 
both difference in latent correlation (Ar sc ) between and popula- 
tion size (N) of pairs of datasets (Figure 2A). It should be noted 



that classification using mean normalized firing rate was com- 
pletely random (AUC ~ 0.5) for any Ar sc or N, reflecting the 
identical firing rate terms we fed into the model. We fit the rela- 
tionship between AUC and Ar sc , holding the number of neurons 
fixed at 80 (Figure 2B, dashed line), with a first order rational 
function in the ordinary least squares sense. The fitted equation 
was AUC = 1 If +0 86 2 (# 2 = 0.9992). As we understood that 
latent correlation in a population of neurons was related to the 
single-trial variance across the population, it followed directly 
that increasing the difference in latent correlation between two 
populations increased our ability to discriminate between them 
with the population variance measure. Even though the relation- 
ship appeared linear on the interval over which we measured it, 
we fit a rational function with the knowledge that AUC has an 
intrinsic maximum of 1. 

We likewise fit the relationship between AUC and popula- 
tion size (N) with a first-order rational function, holding Ar sc 
fixed at 0.05 (Figure 2C, dashed line). The resulting fit was 
AUC = °- 62 x+ 5 f A (R 2 = 0.9819). Unlike the latent correlation 
parameter (Ar sc ), the rational shape of the curve was evident 
on the interval over which we investigated it. Notably, after the 
population exceeded 70 neurons, a plausible number of neu- 
rons to be found on a single 96-electrode array, classification 
ability only increased slowly. This indicates that a sample size 
of roughly 70 neurons might serve as a target for researchers 
interested in using population variance as a proxy for latent 
correlation, as sample sizes below this level (e.g., as might be 
obtained with a linear electrode array or a tetrode) would provide 
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FIGURE 2 | Classification performance of population variance in 
normalized firing rate for simulated data. We produced pairs of data sets 
for each classification condition by varying spike count correlation only. (A) 
Variance classification ability as a function of population size (/V) and 
difference in spike count correlation (Ar sc ). Data sets were generated with 
the same number of neurons (W = 120), with one set having a spike count 
correlation of r sc = 0.05 and the other having a value of r sc = 0.05 + Ar sc . 
(B) Classification ability of variance measure as a function of difference in 
noise correlation between conditions (Ar sc ) with population size fixed at 
N = 20, 40, 80, or 120 neurons. (C) Classification ability of variance 
measure as a function of population size (W) with a difference in noise 
correlation fixed at Ar sc = 0.05, 0.10, 0.20, or 0.35. 



a less reliable estimate of the true population variance around the 
recording site. 

Despite the fact that single-trial variance in normalized popu- 
lation firing rate is a noisy approximation for latent correlation, 
we were able to classify trials into two correlation conditions with 
better-than-chance performance even for the smallest non-zero 
differences in correlation tested and with the smallest number 
of simulated neurons tested (N = 20). However, classification 
performance improved rapidly as the number of neurons sam- 
pled increased to around N = 70, at which point performance 
increases tapered off substantially. Across all sample sizes tested, 
classification ability improved roughly linearly with the difference 
in latent correlation for the range of values that we tested, which 
were chosen to be physiologically plausible. 

3. In vivo DMA 

3.1. MATERIALS AND METHODS 

Subjects and surgical preparation 

We recorded neuronal activity from two adult male Rhesus 
macaques (Macaca mulatto). Surgeries were performed in asep- 
tic conditions and under isofluorane-induced anesthesia. Each 
animal was implanted with a head fixation post and a "Utah" 
microelectrode array (Blackrock Microsystems, Salt Lake City, 
UT) in visual area V4 of the right hemisphere. Details of array 
implantation in V4 have been reported previously (Smith and 
Sommer, 2013). Each microelectrode array consisted of a 10 x 10 
grid of silicon electrodes (each 1 mm in length) spaced 400 [im 
apart. All procedures were approved by the Institutional Animal 
Care and Use Committee of the University of Pittsburgh and 



were in compliance with the guidelines set forth in the National 
Institutes of Health's Guide for the Care and Use of Laboratory 
Animals. 

Data collection 

Signals from each microelectrode in the array were amplified and 
band-pass filtered (250-7500 Hz) by a Ripple (Salt Lake City, UT) 
data acquisition system. For each electrode in the array, wave- 
form segments that exceeded a threshold (periodically adjusted 
using a multiple of the RMS noise on each channel) were digitized 
(30 kHz) and stored for offline analysis and sorting. Waveform 
segments were sorted with an automated algorithm that clustered 
similarly shaped waveforms using a competitive mixture decom- 
position method (Shoham et al, 2003). We manually refined 
the output of this algorithm for each electrode with custom 
time-amplitude window discrimination software ("Spikesort"; 
Kelly et al, 2007; http://smithlab.net/spikesort.html; written 
in MATLAB; Math Works, Natick, MA), taking into account 
the shape of the waveform and the distribution of inter- 
spike intervals. Data from each session were sorted separately. 
Following the offline sorting procedure, we computed the 
signal-to-noise ratio (SNR) of each candidate unit as the ratio 
of the average waveform amplitude to the SD of the wave- 
form noise (Nordhausen et al., 1996; Suner et al., 2005; Kelly 
et al, 2007). Candidates that fell below an SNR of 2.0 were 
discarded. 

Task 

To present visual stimuli we used custom software written in 
MATLAB using the Psychophysics Toolbox extensions (Brainard, 
1997; Kleiner et al., 2007). All stimuli were displayed on a CRT 
monitor with a resolution of 1024 x 768 pixels and a temporal 
refresh of 100 Hz viewed at a distance of 36 cm. We used lookup 
tables to correct for non-linearities in the relation between input 
voltage and phosphor luminance in the monitor. The mean lumi- 
nance of the display was 39cd/m 2 . All stimuli were presented in 
a circular aperture surrounded by a gray field of average lumi- 
nance. For each animal, we mapped the spatial receptive fields of 
each channel by presenting small, drifting sinusoidal gratings at a 
range of spatial positions. We centered our stimuli on the aggre- 
gate receptive field of the recorded units. We also measured the 
spatial- and frequency-tuning of the recorded neurons for each 
animal by presenting drifting gratings at a range of spatial and 
temporal frequencies. For the experiment, we used the spatial 
and temporal frequency combination that maximally excited the 
recorded neurons. 

The animals completed a simple saccade task. We monitored 
the animals gaze positions with an Eyelink 2000 infrared tracking 
system (SR Research, Mississauga, ON). Animals were required to 
maintain fixation within 1.2° of a 0.15° blue dot for 2 s. During 
this period, a drifting sinusoidal grating was presented at the 
aggregate receptive field area for the recorded neurons. For the 
analyses described below, we used the spike counts over the inter- 
val from 0.5 to 1.5 s following stimulus onset. After the 2 s fixation 
interval had elapsed, the fixation point was extinguished and a 
new dot was presented 8° away from the original fixation point 
in a random direction. The animals received a liquid reward for 
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making a saccade to the new location. We presented gratings at 
two orientations (horizontal and vertical) in a block-randomized 
fashion to reduce the effects of stimulus adaptation. 

Surrogate data construction 

We predicted that population variance alone could be used to 
classify single trials originating from conditions of different latent 
spike count correlations. In practice, however, firing rate and 
variance are often conflated due to the quasi-Poisson nature of 
neuronal spiking activity. Nevertheless, our model predicts that 
population variance would be inversely related to latent correla- 
tion even if firing rate is held constant. To test our prediction, 
it was therefore necessary to obtain datasets that differed with 
respect to correlation but did not differ with respect to fir- 
ing rate. For this purpose, we leveraged a known property of 
spike count correlations in visual cortex; namely, that correlation 
between a pair of neurons decreases as the distance between those 
neurons increases (Smith and Kohn, 2008; Smith and Sommer, 
2013). Accordingly, we derived variance estimates for two con- 
ceptually different subpopulations of our full sample of neurons: 
1) neurons located relatively near each other, and 2) neurons 
located relatively far from each other. Specifically, the "far" set was 
predicted to have lower correlation than the "near" set. 

Our procedure is illustrated in Figures 3A-C. To create the 
"near" and "far" datasets we repeatedly drew subsamples from 
each trial in a session in groupings of four channels that were 



either 0.4-0.57 mm apart ("near"; blue elements in Figure 3) or 
1.6-2.26 mm apart ("far"; red elements in Figure 3). Using the 
"near" dataset as an example, we moved the blue square to a posi- 
tion on the array and drew a sample of neurons at that position. 
The sample was discarded if neurons were not present on at least 
2 channels. From that sample, we calculated the population vari- 
ance and stored that value. We repeated this procedure for all 
possible positions of the blue and red squares (81 for "near" and 
36 for "far"). We then averaged all the observed population vari- 
ances to obtain a single value of population variance for the trial 
at each of the two distances, and repeated this procedure for every 
trial. This method was akin to a bootstrap resampling approach 
to estimate the variance. That is, we repeatedly sampled from 
the total population of all recorded neurons that were a partic- 
ular distance from each other to derive a sampling distribution of 
variances, and used the mean of that bootstrapped sampling dis- 
tribution as our estimate for the variance of the population of cells 
located at the given distance from each other. We tested different 
electrode configurations for the "near" and "far" conditions and 
found that the results were qualitatively robust for a wide range of 
configurations. In addition to calculating an estimate of variance 
for each condition, we also calculated the average spike count cor- 
relation for each condition, which allowed us to validate whether 
the two sets did indeed exhibit different magnitudes of correla- 
tion. We calculated the average firing rate for each condition to 
confirm that it was matched. Finally, we classified trials from the 
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FIGURE 3 | In vivo data sampling procedure. (A) Single datasets 
preliminarily sampled by sliding near (blue) and far (red) kernels across 
the array to gather all candidate four-channel groupings. The 10 x 10 grid 
represents the configuration of the microelectrode array. The number in 
each box indicates the number of neurons recorded on the corresponding 
channel in a representative dataset from Monkey Bo. (B) If the two 
kernels produced different numbers of valid groupings (i.e., at least two 
channels with neurons), the larger set was subsampied to match the 
smaller set in the number of groupings, keeping only the groupings with 
the most neurons. (C) Neuron counts were reduced to match across 



kernels. When individual channels had more than one neuron, we 
selected the one with the highest SNR and discarded the others to 
exclude same-electrode pairs. (D) Neurons were normalized according to 
the full history of their recorded trials and the variance and mean were 
calculated across all normalized samples (s 2 and x, respectively). (E) 
Variance and mean estimates were averaged within kernels to obtain an 
estimator of population variance and mean firing rate in populations of 
nearby and distant groups of neurons. (F) This procedure was repeated 
for all trials from each session to construct histograms of population 
variance and mean firing rate for eventual classification. 
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A Variance Classification 
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Monkey Wi 
Monkey Bo 




False Positive Rate 

FIGURE 4 | ROC curves measuring classification ability across two 
animals. (A) Sample ROC curves using population variance in normalized 
firing rate as a classifier. A single example dataset is plotted for each animal 
(corresponding to the datasets plotted with filled symbols in Figure 5). (B) 
Sample ROC curves using mean normalized firing rate for the same 
datasets as in panel (A). 



two conditions using a procedure identical to that described for 
the computer simulations above. All of the trials collected were 
used for the classification analysis (mean of 837 trials per session). 

3.2. RESULTS 

We analyzed 25 days of data from two monkeys, with 10 days for 
monkey Bo and 15 days for monkey Wi (typically 500-1300 trials 
per day). We found that the magnitude of the difference in latent 
correlation ( Ar sc ) from near and far populations varied substan- 
tially; however, for all but 1 day, the difference we obtained was 
greater than zero. This non-zero difference in correlation between 
near and far conditions was reliably classified using the popula- 
tion variance measure, while mean firing rate, by construction, 
performed with random classification. Sample ROC curves from 
one example dataset of each monkey with robust differences in 
correlation are shown in Figure 4. These curves correspond to 
datasets with Ar sc values of 0.0476 and 0.0467 for monkeys Bo 
and Wi, respectively. 

As Figure 5 illustrates, the average difference in latent correla- 
tion measured with the near- far sampling method was in line with 
our expectation based on previously published V4 data (Smith 
and Sommer, 2013) at distances of 0.5 mm for near pairs and 
1.5 mm for far pairs. For monkeys Bo and Wi, the average dif- 
ference in correlation between the near and far conditions was 
Ar sc = 0.0183 and 0.0310, respectively. Despite these relatively 
small differences in r sc , we found that our algorithm was able to 
classify the near and far conditions based on population variance 
using the same procedures outlined above for simulated data. 

We quantified classification performance as the area under the 
ROC curve. Classification using population variance was signif- 
icantly better than chance (i.e., AUC > 0.5, right-tailed f-test, 
Bo: p = 0.0362, Wi: p = 0.0061). In contrast, classification was 
not better than chance when based on mean population firing 
rate (f-test, Bo: p = 0.4676, Wi: p = 0.4436). This was expected 
because our resampling procedure was blind to firing rate, and 
there is no principled reason why the near or far pairs from the 
same array would differ in firing rate. In our simulations we found 
that classification performance was related to the magnitude of 
the r sc difference — a larger difference in r sc made for better clas- 
sification. In the in vivo data we also found this to be the case. 
Classification ability was related to the size of the difference in 
latent correlation measured for the datasets (Pearson correlation: 
r = 0.6100, p = 0.0012, Figure 5). We observed a substantially 
outlying data point from Monkey Wi (Figure 5, with a large Ar sc 
of 0.077 and an incommensurately low AUC of 0.527); to allay the 
influence of outlying data, we fit a rational function to the data 
using a least absolute residual robust fit (black dashed line). The 
resulting fit was AUC = "'a^+o "g 18 ( r1 = 0.3472). For compar- 
ison with the simulation results, the first-order rational fit to the 
simulated data is also plotted (solid gray line). Although our in 
vivo data lie only in a restricted range of the simulation results, 
they agree well with the expected classification performance in 
that range. 

4. DISCUSSION 

In this report, we showed mathematically that the single-trial 
population variance in normalized firing rate is complemen- 
tary to the latent correlation in a population of neurons (see 



Appendix), and then tested this relationship in both simulated 
data and in vivo data. We found that population variance can 
perform moderately well at classifying data arising from dif- 
ferent latent correlation conditions, even when the difference 
in correlation between those conditions is quite small, using 
physiologically plausible values of r sc and realistically attain- 
able numbers of neurons and trials. Using the simulations as a 
guide, we examined the relationship between population vari- 
ance and r sc in vivo and found a close match in classification 
performance. 
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FIGURE 5 | Classification ability, measured by the area under the ROC 
curve (AUC), increases with absolute difference in correlation between 
conditions (| Ar sc |; r = 0.610, p= 0.001). Scatter plot shows 10 and 15 
days of in vivo recordings across two monkeys, Wi (green circles) and Bo 
(orange triangles), respectively. The black dashed line is a rational fit to the 
data in the least absolute residual sense (ft 2 = 0.347), to mitigate the effect 
of the outlier from Monkey Wi data. The gray solid line is a rational fit to the 
simulated data (see Figure 2B, N = 80 trace) in the least squares sense 
(ft 2 = 0.999). The filled symbols represent the example data sets 
presented in Figure 4. 



Noise in neuronal responses can be viewed fundamentally in 
two ways, either as a nuisance to be averaged out over repeated 
trials or as signal that we have yet to uncover. Noise in neu- 
ronal responses has been related to neuronal computation and 
connectivity (Shadlen and Newsome, 1998), motor preparation 
(Churchland et al, 2006), decision-making (Churchland et al., 
2011), and the limits of perception (Parker and Newsome, 1998). 
From the perspective of single-neuron recordings, it is difficult 
to separate true noise from meaningful functional interactions 
among neurons. A fundamental goal of neuroscience is to under- 
stand how neurons interact to encode and process information. 
This goal requires the observation of multiple neurons simulta- 
neously so their interactions can be measured, and the noise in 
a neuronal population becomes a window into those functional 
interactions. As multielectrode recording technology continues to 
advance, it has become possible to record from dozens of neu- 
rons at a time, and these numbers are almost certain to increase. 
Making use of simultaneous recordings from large neuronal pop- 
ulations requires a framework of statistical understanding of 
neuronal interactions that can be built and tested using current 
recording methods. 

Spiking activity in pairs of cortical neurons is correlated on 
a range of spatial and temporal scales (Smith and Kohn, 2008). 
This correlated variability has a powerful influence on popu- 
lation coding and information processing capacity in neuronal 
networks, the extent and sign of which depends on its proper- 
ties (Zohary et al, 1994; Shadlen and Newsome, 1998; Abbott 
and Dayan, 1999; Averbeck et al., 2006). Furthermore, a series 



of recent studies have shown that correlated variability can be 
modulated by instruction and experience (Cohen and Maunsell, 
2009; Mitchell et al, 2009; Komiyama et al., 2010; Gu et al, 201 1; 
Jeanne et al, 2013). Given the interaction between correlated vari- 
ability and population coding, these findings suggest that spike 
count correlations might be indicative of the processing state of a 
particular population of neurons, or perhaps even actively mod- 
ulated by the brain in order to improve processing in the relevant 
neuronal population. Unfortunately, correlated variability comes 
with a substantial limitation in that it can only be assayed over 
multiple repeated trials of an identical stimulus. In this report, 
we described how single-trial variance in normalized firing rate is 
inversely related to latent correlation, and we proposed that pop- 
ulation variance might function as a single-trial proxy for spike 
count correlation. 

Single-trial measures and trial- averaged measures have com- 
plementary strengths and weaknesses. While single-trial measures 
are typically noisier than their trial-averaged counterparts, they 
can reveal trial-to-trial dynamics of the phenomenon of interest. 
For example, an all-too-familiar aspect of attention performance 
is that it tends to lapse. However, studies of attention that use 
trial-averaged measures (such as mean reaction time, hit rate 
accuracy, or spike count correlations) implicitly assume that 
attention is deployed consistently across all the trials within a 
particular attention condition. While efforts such as these have 
certainly paid dividends, single-trial measures are needed to study 
within-condition fluctuations such as the dynamics of attentional 
deployment on a sub-second time scale. A measure of neuronal 
population response in terms of firing rate has recently been 
employed for a similar aim (Cohen and Maunsell, 2010, 2011). 
This is possible, of course, because attention modulates firing 
rates as well as correlated variability. Another example of the 
strengths of single-trial analysis is provided by the field of motor 
prosthetics, where both firing rate and higher-order statistics can 
be used to decode a motor action (Yu et al., 2009). We suggest that 
population variance can be used to study neuronal interactions in 
single-trial contexts, serving as a near instantaneous estimate of 
the latent correlation in a population in much the same way as 
spike count correlation has been used across trials. This measure 
could be employed in conjunction with firing rate or in isolation 
depending on the response characteristics of the population in the 
conditions under study. 

Despite the good general correspondence between the in vivo 
data and the predictions of the simulations, the trend for our 
observed relationship between classification performance and 
Ar sc was slightly below that which was predicted (compare black 
dashed line to solid gray line in Figure 5). Several factors may 
contribute to this difference. Firstly, it may be that our simulated 
data did not match the actual neuronal data with respect to crit- 
ical statistics that we did not explicitly control. We constructed 
our simulated data sets to match our in vivo data as closely as 
possible (e.g., number of neurons and average firing rate and 
correlation). Using the published algorithm of Macke and col- 
leagues (2009), we were mostly successful in this aim, although 
an exact match on some of the desired parameters was not pos- 
sible. Specifically, the variance in the pair- wise correlation within 
our simulated datasets tended to be much lower than what was 
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observed in the real data, which may have affected classification 
performance. Effects carried by higher-order statistics of the in 
vivo data, such as skew or kurtosis, that we did not match with 
our simulations may have also affected classification. This does 
not present an inherent drawback, however, as future efforts may 
be able to leverage these higher moments for additional classifi- 
cation power, much as we have aimed to do here with population 
variance. Secondly, we implicitly assumed that latent correlation 
was constant for neurons located at a specific distance from each 
other. In actuality, and as we described in detail above, correla- 
tion is known to fluctuate due to a number of cognitive factors. 
Within-condition variance in latent correlation due to fluctua- 
tions in factors such as arousal, for example, would tend to reduce 
our classification performance compared to the simulation pre- 
dictions. Nevertheless, our simulation findings match fairly well 
with the in vivo data (Figure 5), particularly given the restricted 
range of Ar sc values we observed, and this result strengthens our 
confidence in their predictive value. 

For our simulation and in vivo data analyses we aimed to keep 
firing rate constant between conditions. We chose this in our sim- 
ulations, and ensured it was true in the in vivo data by statistical 
comparison, to focus on the ability to classify correlation condi- 
tions on the basis of variance alone. Of course, most real world 
situations will present some differences in firing rate across con- 
ditions. It is important to take these firing rate differences into 
account because cross-trial variance in individual neurons tends 
to be directly related to firing rate due to their quasi-Poisson 
characteristics (Tolhurst et al., 1983). However, we should empha- 
size that cross-trial variance of individual neurons is theoretically 
an independent quantity from the variance in normalized firing 
rate across all neurons on a single trial (population variance). 
Intuitively, a single neuron might be highly variable or highly 
reliable, and that knowledge gives us no information about its 
covariance with the rest of the population. Thus, while the inter- 
action between firing rate and the variance of individual neurons 
(Tolhurst et al., 1983) and the covariance of pairs of neurons (de la 
Rocha et al., 2007) have both been studied, it remains unclear how 
population variance scales with firing rate in real systems. This 
topic will require further research, and likely will require a mul- 
tivariate approach to apply population variance as a measure of 
latent correlation in dynamic contexts. Single-trial approaches to 
parceling out the relative contributions of shared and indepen- 
dent variability to neuronal responses (Yu et al., 2009) may prove 
useful in disentangling cross-trial variance in individual neurons 
from population variance. 

Our in vivo data were chosen and analyzed based on the pre- 
viously demonstrated effect that distance affects correlation in 
V4 (Smith and Sommer, 2013). We used a resampling procedure 
to identify near and far pairs of neurons in the data, with the 
result that firing rates were well matched but correlation differed. 
While the differences in correlation that we observed were small 
( Ar sc ~ 0.03; Figure 5), they were similar to what has been pre- 
viously reported for correlation as a function of distance in V4 
(Smith and Sommer, 2013). For these small differences in corre- 
lation we found a small but reliably better-than-chance classifi- 
cation performance (Figure 5). We also found that, as expected, 
greater differences in correlation lead to improved classification 



performance, and in a manner that closely matched the predic- 
tions of our computer simulations. If we consider other experi- 
mental contexts where larger differences in correlation have been 
reported, our computer simulations predict greatly improved 
classification performance. For example, Cohen and Maunsell 
(2011) found changes in correlation as large as Ar sc ~ 0.1 in V4 
during an attention task, while Gu and colleagues (2011) found 
changes in correlation as large as Ar sc ~ 0.12 in area MST fol- 
lowing a perceptual learning task. Our simulations predict that for 
effects of these magnitudes, classification performance approach- 
ing AUC = 0.7 would be achieved with around 120 neurons a 
quantity that is not unattainable even today, and which will likely 
be commonplace in the near future. 

In this report we advanced the idea that population variance in 
normalized firing rate can be a useful tool for single-trial analy- 
ses. Since population variance is intimately related to correlation, 
variance itself is indicative of functional connectivity within a 
population of neurons. This is of paramount importance because 
the computational power of the brain comes from the interac- 
tion of neurons, and is best studied in the context of neuronal 
interactions. Single-trial measures are equally important because 
they enable us to study the dynamic fluctuations that inevitably 
characterize perceptual and cognitive processes as the brain inter- 
prets information from the natural world. We suggest that the 
use of variance as a fulcrum for understanding neuronal com- 
putations and measuring functional interactions is an essential 
development to make use of the massively parallel recordings that 
lie in our near future. 
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APPENDIX 

MATHEMATICAL PROOF OF CONCEPT 

Let y be an n x 1 random vector variable and let £[y] = u,, 
Cov[y] = Y, . In this context, each element y, of y is a spike count 
from the z'th of n neurons observed on a single trial. Each neuron 
i has a distribution of spike counts with mean u,, and variance 
For any n x n symmetric matrix A, the product /Ay is a 
quadratic form of y. It is known (Graybill, 1976, Chapter 4) that 
the expectation of this quadratic form is: 



£[/Ay] = fr{AE} - u/Au, 



where fr{} denotes the trace operator. If we let A 



(1) 



I-Kl/n) 

- 1 



where I is an n x n identity matrix and 1 is an n x n matrix 
of ones, then the quadratic form /Ay is the variance of y. In 
our framework, this quantity represents the single-trial variance 
across the population of neurons. 

If we normalize the spike counts from each neuron by 
z-scoring, i.e., z, = , then E becomes equivalent to the cor- 
relation matrix, C. The expectation of the variance of z then 
becomes: 

Efz'Az] = fr{AC} - O'AO = fr{AC} (2) 

Next we will prove that the expectation of the variance of z equals 
one minus the average correlation between neurons. 

A property of the trace operator is that the trace of the 
inner product of two matrices AC is equal to the sum of the 
Hadamard (or "element- wise") product A' o C (i.e., (A' o C); ; = 
(A')i s ■ (C); ,•)• Since A is symmetric (by definition): 



(3) 



Hereafter, the subscripts i and j will be assumed. Substituting A 
as defined above: 



(4a) 



£((^f~I+-I--l))°C) «4b. 



n — 1 V n n n 



^ n - 1 \n « 1 )) OC 

£G I ° C+ * L T) (I - 1) ° C ) 

E(^c) + E(^a-Doc) 



(4c) 
(4d) 



Since C is a correlation matrix, all diagonal elements are equal 
to 1, meaning I o C = I. Further still, trivially, ^,{yI°C) = 
£(~I) = Lit follows that: 



fr{AC} = 1 



1 



«(n — 1) 



^((l-I)oC) 



(5) 



The term containing the sum is the average of all of the off- 
diagonal elements of C; that is, the average pairwise correlation 
between the population of neurons. From (2), we have that 
the expectation of the single-trial variance across the neuron 
population is equal to tr{ AC} : 



£[z'Az] = fr{AC} = 1 



1 



n(n — 1) 



^((l-I)oC) □ (6) 



Thus, the expectation of the single-trial variance in normalized 
firing rate across a population of neurons is precisely equal to one 
minus the average pairwise correlation between neurons in that 
population. 

This relationship is a generalization of the well-known Law 
of Total Variance, which explains that the total variance in a 
random variable Y, when conditioned on some random vari- 
able X, is equal to sum of the expectation of the variance and 
the variance of the expectation. That is, Var[7] = £[V r ar[7|X]] + 
V r ar[_E[y|X]] (Weiss et al, 2005). These two terms are also 
referred to as the "unexplained variance" and "explained vari- 
ance" (the variance of Y that can be explained by X), respec- 
tively. In our case, we have n random variables (neurons), each 
of which is explained in some part by the other n — 1 vari- 
ables in accordance with the correlation matrix. As mentioned 
above, /Ay is the variance on a single trial, and .Ely 7 Ay] is 
the expectation of that variance (eq. 1). Since \l is the expec- 
tation of y, jjl'Ajjl represents the variance of the expectation. 
Finally, t r{ AE } represents the total variance of the random vector 
variable y. 

Although we have proved here that the expectation of the 
single-trial variance is the complement to correlation, we have 
assumed that the true means and variances (diag(L)) 
of the spike counts (y) must be known in order to stan- 
dardize the variables. In practice, however, these statistics can 
only be estimated, and this limitation will introduce error 
to the estimate of correlation based on the measurement of 
single-trial variance. This is the typical problem of sampling 
error, and it can be mitigated by sampling a greater number 
of observations of spike counts to better estimate the statis- 
tics of the spike count distributions of the neurons under 
observation. 
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